The AT&T WATSON Speech Recognizer
Abstract
This paper describes the AT&T WATSON real-time speech recognizer, the product of several decades of research at AT&T. The recognizer handles a wide range of vocabulary sizes and is based on continuous-density hidden Markov models for acoustic modeling and finite state networks for language modeling. The recognition network is optimized for efficient search. We identify the algorithms used for high-accuracy, real-time and low-latency recognition. We present results for small and large vocabulary tasks taken from the AT&T VoiceTone® service, showing word accuracy improvement of about 5% absolute and real-time processing speed-up by a factor between 2 and 3.
Similar resources
A binaural microscopic model based on the modulation filter bank for predicting speech intelligibility in normal-hearing listeners
In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...
ALGONQUIN: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition
One approach to robust speech recognition is to use a simple speech model to remove the distortion, before applying the speech recognizer. Previous attempts at this approach have relied on unimodal or point estimates of the noise for each utterance. In challenging acoustic environments, e.g., an airport, the spectrum of the noise changes rapidly during an utterance, making a point estimate a po...
Unsupervised training of an HMM-based self-organizing unit recognizer with applications to topic classification and keyword discovery
We present our approach to unsupervised training of speech recognizers. Our approach iteratively adjusts sound units that are optimized for the acoustic domain of interest. We thus enable the use of speech recognizers for applications in speech domains where transcriptions do not exist. The resulting recognizer is a state-of-the-art recognizer on the optimized units. Specifically we propose buildi...
An effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation
This paper proposes an effective feature compensation scheme to address a real-life situation where a clean speech database is not available for Gaussian Mixture Model (GMM) training for a model-based feature compensation method. The proposed scheme employs a Support Vector Machine (SVM)-based model selection method to effectively generate the GMM for our feature compensation method directly from ...
Rapid Match Training for Large Vocabularies
This paper describes a new algorithm for building rapid match models for use in Dragon's continuous speech recognizer. Rather than working from a single representative token for each word, the new procedure works directly from a set of trained hidden Markov models. By simulated traversals of the HMMs, we generate a collection of sample tokens for each word which are then averaged together to b...